The Akaike information criterion (AIC) is a measure of the relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model, relative to each of the other models. Hence, AIC provides a means for model selection.

AIC is founded on information theory: it offers a relative estimate of the information lost when a given model is used to represent the process that generates the data. In doing so, it deals with the trade-off between the goodness of fit of the model and the complexity of the model.

AIC does not provide a test of a model in the sense of testing a null hypothesis; i.e. AIC can tell us nothing about the quality of a model in an absolute sense. If all the candidate models fit poorly, AIC will not give any warning of that.

==Definition==
Suppose that we have a statistical model of some data. Let ''L'' be the maximum value of the likelihood function for the model, and let ''k'' be the number of estimated parameters in the model. Then the AIC value of the model is the following:

:<math>\mathrm{AIC} = 2k - 2\ln(L)</math>

Given a set of candidate models for the data, ''the preferred model is the one with the minimum AIC value.'' Hence AIC rewards goodness of fit (as assessed by the likelihood function), but it also includes a penalty that is an increasing function of the number of estimated parameters. The penalty discourages overfitting, because increasing the number of parameters in the model almost always improves the goodness of the fit.

AIC is founded in information theory. Suppose that the data is generated by some unknown process ''f''. We consider two candidate models to represent ''f'': ''g''<sub>1</sub> and ''g''<sub>2</sub>. If we knew ''f'', then we could find the information lost from using ''g''<sub>1</sub> to represent ''f'' by calculating the Kullback–Leibler divergence, ''D''<sub>KL</sub>(''f'' ‖ ''g''<sub>1</sub>); similarly, the information lost from using ''g''<sub>2</sub> to represent ''f'' could be found by calculating ''D''<sub>KL</sub>(''f'' ‖ ''g''<sub>2</sub>). We would then choose the candidate model that minimized the information loss.

We cannot choose with certainty, because we do not know ''f''. Akaike showed, however, that we can estimate, via AIC, how much more (or less) information is lost by ''g''<sub>1</sub> than by ''g''<sub>2</sub>. The estimate, though, is only valid asymptotically; if the number of data points is small, then some correction is often necessary (see AICc, below).
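As a worked illustration of the definition (a sketch, not part of the source article), the following Python snippet computes the AIC of a simple Gaussian model. The data values and the helper names <code>aic</code> and <code>gaussian_max_log_likelihood</code> are made up for the example; for an i.i.d. Gaussian model with both parameters fitted by maximum likelihood, ln ''L'' has the closed form −(''n''/2)(ln(2π&sigma;&#770;²) + 1), which the helper uses.

<syntaxhighlight lang="python">
import math

def aic(log_likelihood_max, k):
    """AIC = 2k - 2 ln(L), where L is the maximized likelihood."""
    return 2 * k - 2 * log_likelihood_max

def gaussian_max_log_likelihood(data):
    """ln L at the MLE for an i.i.d. Gaussian model (mu and sigma^2 both fitted).

    Plugging the sample mean and the (biased) sample variance into the
    Gaussian log-likelihood collapses it to -n/2 * (ln(2*pi*sigma2_hat) + 1).
    """
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)

data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3]  # made-up sample
log_l = gaussian_max_log_likelihood(data)
print(aic(log_l, k=2))  # k = 2 estimated parameters: mu and sigma^2
</syntaxhighlight>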
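To make the model-selection rule concrete, here is a similar hedged sketch comparing two candidate models on the same made-up data: ''g''<sub>1</sub> fixes the mean at zero and estimates only the variance (''k'' = 1), while ''g''<sub>2</sub> estimates both mean and variance (''k'' = 2). Whichever model has the smaller AIC value is preferred.

<syntaxhighlight lang="python">
import math

def gaussian_log_l(data, mu, sigma2):
    """Gaussian log-likelihood for given mu and sigma^2."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)

data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3]  # made-up sample
n = len(data)

# g1: mean fixed at 0; only sigma^2 is estimated (k = 1).
# The MLE of sigma^2 under a fixed mean is the mean squared deviation from it.
sigma2_g1 = sum(x ** 2 for x in data) / n
aic_g1 = 2 * 1 - 2 * gaussian_log_l(data, 0.0, sigma2_g1)

# g2: both mean and sigma^2 are estimated (k = 2).
mu_g2 = sum(data) / n
sigma2_g2 = sum((x - mu_g2) ** 2 for x in data) / n
aic_g2 = 2 * 2 - 2 * gaussian_log_l(data, mu_g2, sigma2_g2)

# The preferred model is the one with the minimum AIC value.
print(f"AIC(g1) = {aic_g1:.2f}, AIC(g2) = {aic_g2:.2f}")
print("prefer", "g1" if aic_g1 < aic_g2 else "g2")
</syntaxhighlight>

With these values the data cluster near 2, so the free-mean model ''g''<sub>2</sub> fits far better and attains the smaller AIC despite its extra parameter; the penalty term only decides the comparison when the fits are close.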